
# Audio Understanding

Videollama2.1 7B AV CoT
Apache-2.0
VideoLLaMA2.1-7B-AV is a multimodal large language model for audio-visual question answering. It takes both video and audio as input and generates answers and descriptions.
Video-to-Text, Transformers, English
lym0302
34
0
Qwen2 Audio 7B Instruct 4bit
This is the 4-bit quantized version of Qwen2-Audio-7B-Instruct, an audio-text multimodal large language model based on Alibaba Cloud's original Qwen model.
Audio-to-Text, Transformers
alicekyting
1,090
6
© 2025 AIbase